WHAT IS AN ALN?

An ALN (adaptive logic network) is a kind of artificial neural
network. It is a special case of the familiar multilayer perceptron
(MLP) feedforward network, and it has been designed to perform basic
pattern classification computations at extremely high speed -- at the
speed of pure combinational logic, which is the upper limit for any
digital system.

WHAT COULD ONE USE ALNS FOR?

They are currently being tried out in simulations using the atree
software for various applications:

1. Optical character recognition (OCR) -- a demo of OCR is included
in atree release 2.7, and it makes clear the great immunity of ALN
trees to noise (Thomas).

2. Discriminating between two kinds of particles produced by an
accelerator, where a decision must be made in less than 1/2
microsecond (Volkemer).

3. Designing control systems for prostheses, to enable people with
spinal cord damage to walk by means of functional electrical
stimulation; simplicity of fabrication, safety and lightness are
important here (Stein et al.).

4. Analyzing spectra of tar sands to determine their composition;
data analysis was used to cut the number of spectral measurements
required from 700 down to 25 (Parsons).

5. Classifying the amount of fat in beef, based on texture measures
taken from ultrasound images of beef cattle (McCauley et al.).

INPUTS

The following assumes the reader has read something about typical
feed-forward neural nets. In the case of ALNs, the input data to the
learning system must be boolean. Non-boolean data is coded into
boolean vectors before entering the learning part; this may involve
arithmetic and table-lookup operations. Several boolean outputs may
be combined to provide outputs more complex than boolean, say
continuous values. There are no a priori limitations on the kinds of
functions that can be treated using ALNs.

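As an illustration of such coding, here is a small sketch (written
for this note, not taken from the atree package) of "thermometer"
coding, one common way to turn a continuous value into a boolean
vector:

    /* Thermometer coding: code[i] = 1 exactly when x exceeds the
       i-th of a ladder of equally spaced thresholds. BITS and the
       range [lo, hi] are arbitrary choices for this example. */
    #include <stdio.h>

    #define BITS 8

    void thermometer(double x, double lo, double hi, int code[BITS])
    {
        int i;
        for (i = 0; i < BITS; i++)
            code[i] = (x >= lo + (hi - lo) * (i + 1) / (BITS + 1));
    }

    int main(void)
    {
        int code[BITS], i;
        thermometer(0.4, 0.0, 1.0, code);
        for (i = 0; i < BITS; i++)
            printf("%d", code[i]); /* prints 11100000 */
        printf("\n");
        return 0;
    }

A useful property of this particular coding is that values which are
close together differ in only a few bits, a point taken up under
GENERALIZATION AND NOISE IMMUNITY below.
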
ARCHITECTURE OF THE NET

The interconnections of nodes usually take the form of a binary tree,
except that there are multiple connections of inputs to the leaf
nodes of the tree, some involving inversions. There is no limit on
the number of hidden layers in practice; ten or fifteen is typical.
A small net might have 255 nodes (a complete binary tree with eight
levels), a large one 65,535 nodes (sixteen levels).

One doesn't have to worry about the shape of the net as much as with
other networks.

THE ELEMENTS OF THE NEURAL NETWORK

The nodes (or adaptive logic elements) have two input leads. The
input signals x1, x2 are boolean (0 or 1). The connection "weight"
used for each input (to use the term from the usual kind of neural
net) is determined by a single bit of information. The nonlinearity,
or "squashing function", used in ALNs is just a simple threshold
(comparison to a constant). Specifically, if b1 and b2 are boolean,
then the element computes a boolean value which is 1 if and only if

    (b1 + 1) * x1 + (b2 + 1) * x2 >= 2.

The four combinations of b1 and b2 (00, 11, 10, 01) generate four
boolean functions of two variables: AND, OR, LEFT and RIGHT, where
LEFT(x1, x2) = x1 and RIGHT(x1, x2) = x2. For example, with b1 = b2
= 0, we have 1 * x1 + 1 * x2 >= 2 if and only if both x1 AND x2 are
1.

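Written out as code, the whole element is a single comparison (a
sketch of the formula above, not the atree source):

    /* ALN node: the function bits b1, b2 select the node's function:
       00 -> AND, 11 -> OR, 10 -> LEFT, 01 -> RIGHT. */
    int node(int b1, int b2, int x1, int x2)
    {
        return (b1 + 1) * x1 + (b2 + 1) * x2 >= 2;
    }

Checking another case: with b1 = 1, b2 = 0, the sum 2 * x1 + x2
reaches 2 exactly when x1 = 1, so the node computes LEFT; RIGHT is
symmetric.
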
The LEFT and RIGHT functions, in effect, serve to provide some
flexibility in the interconnections. They disappear after training is
complete.

Two counters in each node count up and down according to whether the
output desired of the network is 1 or 0, one counter for the node
input 01 and the other for the node input 10. They try to determine
the best function for the node, even in the presence of noise and
conflicting training data. The counters also disappear after training
is complete.

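One way to picture the counters (a simplified sketch of my own; the
actual update and decision rules are in atree.c) is as two signed
vote tallies, one for each input pattern on which the four candidate
functions disagree:

    /* Simplified counter sketch. On input (1,0), OR and LEFT output
       1 while AND and RIGHT output 0, and similarly on (0,1). Each
       counter tallies the desired network output on one of those
       patterns, and the counter signs then choose the function bits:
       b1 = (c10 >= 0), b2 = (c01 >= 0). In atree itself, votes are
       only counted when the node is responsible for the network
       output, as described in the sections below. */
    typedef struct {
        int c10; /* votes accumulated when (x1, x2) = (1, 0) */
        int c01; /* votes accumulated when (x1, x2) = (0, 1) */
    } counters;

    void vote(counters *c, int x1, int x2, int desired)
    {
        int d = desired ? 1 : -1;
        if (x1 == 1 && x2 == 0) c->c10 += d;
        if (x1 == 0 && x2 == 1) c->c01 += d;
    }
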
THE TRAINING OF THE TREE

A tree of such elements is randomly connected to some boolean input
variables and their complements. Inputs enter the tree at the leaves,
and the output appears at the root node. The input variables are
components of a vector representing a pattern, such as a handwritten
character. Training on a set of input vectors, furnished with the
desired boolean outputs, consists of presenting the vectors in random
sequence along with their desired outputs. It results in an
assignment of functions to nodes that allows the tree to approximate
the outputs specified by the training data.

To orchestrate the changes to be made during training, a solution to
the credit assignment problem is used. It is based on the idea that
if one input to an AND (OR) is 0 (1), then it doesn't matter what the
nodes of the subtree attached to the other input are doing. According
to this simple rule, a node is responsible if a change in its output
would change the network output. (This is like backprop, except that
the quantity being propagated backwards is either 0 or 1.)

The fact that all node functions are increasing causes a positive
correlation between the node output and the network output, which
indicates the direction of changes that must be made. In addition,
there is a small refinement which goes beyond this simple scheme.
(You can look at the function starting

    adapt(tree, vec, desired)

in atree.c to confirm that this is a very simple yet effective
mechanism.)

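The rule can be sketched in a few lines (the tree representation here
is invented for the example; the real code is the adapt function in
atree.c):

    /* Credit assignment sketch: a child is responsible only if its
       parent is responsible and flipping the child's output would
       flip the parent's output, given what the sibling produced. */
    #include <stddef.h>

    typedef struct node {
        int b1, b2;                /* 00=AND, 11=OR, 10=LEFT, 01=RIGHT */
        int leaf_value;            /* input bit; used at leaves only */
        struct node *left, *right; /* both NULL at a leaf */
    } NODE;

    static int f(const NODE *n, int l, int r)
    {
        return (n->b1 + 1) * l + (n->b2 + 1) * r >= 2;
    }

    static int eval(const NODE *n)
    {
        return n->left ? f(n, eval(n->left), eval(n->right))
                       : n->leaf_value;
    }

    static void assign_credit(const NODE *n, int responsible)
    {
        int l, r;
        if (n->left == NULL)
            return;
        l = eval(n->left);  /* a real implementation would cache */
        r = eval(n->right); /* these instead of re-evaluating    */
        /* a responsible node updates its counters here, using
           (l, r) and the desired network output */
        assign_credit(n->left,  responsible && f(n, 0, r) != f(n, 1, r));
        assign_credit(n->right, responsible && f(n, l, 0) != f(n, l, 1));
    }

For an AND (b1 = b2 = 0), f(0, r) differs from f(1, r) exactly when
r = 1, i.e. when the sibling is not already forcing the output to 0
-- precisely the rule stated above.
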
After training, the ALN has a built-in capacity to generalize, i.e.
to give appropriate responses to possible input vectors not presented
during training. This comes about not because of any additional
hardware or software, but simply because of the architecture of the
tree -- trees filter out perturbations of the inputs. Hence no speed
is lost in order to generalize well; the architecture does it.

The training set can contain contradictory samples, and the relative
numbers of such conflicting samples for the same input are taken into
account by the training procedure, so as to approximate maximum
likelihood and Bayes decision rules.

BUILDING HARDWARE FROM THE TRAINED SYSTEM

Training an ALN results in a combinational logic circuit. After
adaptation, one eliminates the LEFT and RIGHT functions, which are
now redundant, and groups connected subtrees of two-input ANDs
together to get ANDs of fanin >= 2, and similarly for ORs. The
(alternating) layers of ANDs and ORs are logically equivalent to
layers consisting entirely of NANDs, by De Morgan's law: for example,
OR(AND(a,b), AND(c,d)) = NAND(NAND(a,b), NAND(c,d)).

The result is directly translatable into easily built hardware --
essentially a tree of transistors. This is far, far simpler than any
other neural system. There are no built-in delays, flip-flops,
arithmetic units or table lookups to slow the system down.

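The first step, removing the LEFT and RIGHT nodes, can be sketched
with the NODE type from the credit assignment example above (again
illustrative only, not the atree release code):

    /* Replace each LEFT or RIGHT node by the subtree it passes
       through, leaving a pure AND/OR tree. */
    NODE *collapse(NODE *n)
    {
        if (n == NULL || n->left == NULL)
            return n; /* leaf: nothing to do */
        n->left  = collapse(n->left);
        n->right = collapse(n->right);
        if (n->b1 == 1 && n->b2 == 0) return n->left;  /* LEFT  */
        if (n->b1 == 0 && n->b2 == 1) return n->right; /* RIGHT */
        return n;
    }

Calling root = collapse(root) leaves only ANDs and ORs, after which
adjacent gates of the same kind can be merged into single gates of
fanin >= 2.
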
HARDWARE SPEED

The execution time of such a circuit depends on its depth, not its
size. With a twenty-layer tree of transistors in submicron CMOS, the
entire function would be executed in a few nanoseconds.

In the usual terms of numbers of one-bit connections per second, we
would get many trillions of connections per second from an ALN,
thousands of times more than existing special hardware systems for
implementing backprop. The fastest backprop hardware we know of on
the market allows only a few billion connections per second at peak
rate.

COST OF HARDWARE

For small problems, even one of the off-the-shelf programmable chips
that exist today can do the job, though at a slower rate than a
custom chip. Such fast, inexpensive hardware is being investigated at
an accelerator facility for making a fast trigger to recognize
elementary particles in under 1/2 microsecond. A few hundred dollars
should cover the materials.

This is in sharp contrast to other kinds of neural networks, where
you need processors costing thousands of dollars just to get into the
game seriously, and even then the result of all that expensive
learning will be a system that executes far more slowly than an ALN
tree of transistors. With ALNs, you are seriously in the game from
day one with the atree simulators.

SPEED OF THE ATREE SIMULATOR

Even in the atree software simulator, evaluation is fast, thanks to
the fact that boolean logic can be lazily evaluated. Once one input
to an AND gate is computed to be 0, we don't need the other input, so
we can suppress a whole subtree of the computation. Laziness can
provide speedups by factors of 10 or 100 for typical problems.
Backprop systems, on the other hand, do not enjoy such speedups:
non-logical MLPs compute *everything*. It's like doing programming
without any "if"-statements. (It's no wonder the usual neural
networks need massive parallelism -- they suffer from this serious
omission.)

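In code, laziness is just short-circuit evaluation (a sketch reusing
the NODE type and f() from the credit assignment example):

    /* Lazy evaluation: an AND whose left subtree yields 0, or an OR
       whose left subtree yields 1, never visits its right subtree,
       and a LEFT node skips its right subtree entirely. */
    static int lazy_eval(const NODE *n)
    {
        int l;
        if (n->left == NULL)
            return n->leaf_value;
        l = lazy_eval(n->left);
        if (n->b1 == 0 && n->b2 == 0 && l == 0) return 0; /* AND  */
        if (n->b1 == 1 && n->b2 == 1 && l == 1) return 1; /* OR   */
        if (n->b1 == 1 && n->b2 == 0) return l;           /* LEFT */
        return f(n, l, lazy_eval(n->right));
    }
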
TRAINING SPEED

The adaptation algorithm is also designed to be fast: what
corresponds to backprop, a credit assignment computation, is also
combinational logic in hardware, so it will be possible to build
systems which adapt a tree at a rate of, say, 25 million patterns per
second.

The difference in speed of training and execution between backprop
systems on the one hand and ALN systems on the other results from the
drastic simplification of the usual MLP paradigm to base it on
boolean logic. The difference is immense. One shouldn't think of it
as getting a racing car in place of a station wagon (twice as fast,
say); it is more like getting a rocket in place of a hot air balloon
(thousands of times faster). As the problems get larger, the
advantage of the ALN system over the usual MLP just keeps on growing.

GENERALIZATION AND NOISE IMMUNITY

Good generalization without loss of speed is the strong point of
ALNs. Pattern classification problems generally require an output
which is the same for all patterns in a class. Members of a class are
in some way similar. Similarity could be based on all sorts of
criteria, but generally it can be defined by saying that two
individuals have many features in common. If a feature is represented
by a boolean variable, then the above says that similarity can be
measured by Hamming distance between boolean vectors. In order to use
ALNs, we generally have to code other representations of individuals
into boolean vectors in such a way that similarity for pattern
classification purposes is reflected in Hamming distance.

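Concretely, the similarity measure is just a count of differing bits:

    /* Hamming distance between boolean vectors u and v of length n. */
    int hamming(const int *u, const int *v, int n)
    {
        int i, d = 0;
        for (i = 0; i < n; i++)
            d += (u[i] != v[i]);
        return d;
    }

The thermometer coding sketched under INPUTS has exactly the desired
property: values that are close together produce codes at small
Hamming distance.
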
Binary trees with node functions AND, OR, LEFT and RIGHT have been
shown to have the property that the functions they realize tend to
have the same output value when the inputs are perturbed. This result
depends on averaging over all functions on the tree, weighted by the
number of ways each can be produced by setting the two bits at each
node which determine the node's function. It is also necessary to
average over all input vectors and all perturbations of a given
number of bits.

The plausible reasoning behind this is that when an input signal to a
gate changes, it may or may not change the output. A change to an
input of an AND will not change the output if the other input to the
AND is 0. Overall, the probability that a change propagates through a
gate is 0.5 when one of its inputs changes. So for a 10-layer tree,
the probability of a single bit change at a leaf causing a change of
the tree output is about 0.5^10 = 1/1024.

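This argument is easy to check empirically. The following standalone
sketch (a test harness written for this note, not part of atree)
builds random 10-layer trees, flips one leaf input, and counts how
often the root output changes; the estimate should come out near
1/1024:

    #include <stdio.h>
    #include <stdlib.h>

    #define DEPTH  10          /* layers of gates */
    #define LEAVES (1 << DEPTH)
    #define TRIALS 200000

    static int b1[LEAVES], b2[LEAVES]; /* function bits, heap order */
    static int leaf[LEAVES];

    /* Evaluate the complete tree: node 1 is the root, node i has
       children 2i and 2i+1, and indices >= LEAVES are leaf inputs. */
    static int eval(int i)
    {
        if (i >= LEAVES)
            return leaf[i - LEAVES];
        return (b1[i] + 1) * eval(2 * i)
             + (b2[i] + 1) * eval(2 * i + 1) >= 2;
    }

    int main(void)
    {
        int t, i, before, flips = 0;
        for (t = 0; t < TRIALS; t++) {
            for (i = 1; i < LEAVES; i++) {
                b1[i] = rand() & 1;
                b2[i] = rand() & 1;
            }
            for (i = 0; i < LEAVES; i++)
                leaf[i] = rand() & 1;
            before = eval(1);
            leaf[rand() % LEAVES] ^= 1; /* perturb one input bit */
            flips += (eval(1) != before);
        }
        printf("estimated flip probability: %g (1/1024 = %g)\n",
               (double)flips / TRIALS, 1.0 / 1024);
        return 0;
    }
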
ALNs match the solution to the problem of fast pattern recognition:
simple properties of boolean logic provide speed and immunity to
noise, and open the possibility of lazy evaluation.

THE FUTURE

In the Fall of 1992, we expect to have a much enhanced version,
embodying some recent developments that have resulted in part from
interaction with people on the Internet via email and news. It will
have

- superior adaptive algorithms for both continuous and boolean data

- an ability to deal with continuous values, without significant
  loss of speed, that will make other MLPs obsolete, including those
  using variants of backprop

- a sound statistical basis for fitting continuous functions

- a design methodology that will be easy to understand and will
  result in provably *safe* systems; there will be no more problem
  of wild values being produced by a "black box", so ALNs can be
  applied in safety-critical areas

- an interface to common database management systems on micros and
  mainframes

- an interface to a popular spreadsheet program under Windows

- facilities to encourage developers to apply ALNs to pattern
  recognition problems: OCR, voice, control of virtual realities,
  etc.

- a price, since we can't carry on otherwise.

AN INVITATION

I hope the above arguments will be enough to convince the reader to
try the atree release 2.0 software for Unix and/or the new release
2.7 for Windows on the IBM PC and compatibles. At the very least, you
should take a look at the OCR demo in release 2.7 to see the great
noise immunity which results from such simple means.

The lf language should make it easy to get started, and there are
lots of examples in release 2.7. (You could look at them even if you
only want the Unix version.)

If you like what you see, work with us. Lots of researchers are doing
so, some via the net: physicists, engineers, neuroscientists, speech
pathologists, ...

Join the alnl mailing list (email to alnl-request@cs.ualberta.ca,
providing a correct address that will last for a while, please).

The interest in backprop-type neural networks which has blossomed in
the last few years is obviously fading fast, and funding is less than
expected. I have heard that this is because "neural networks" were
just slightly better than standard techniques. The problems that need
high-speed pattern recognition are still there, though, and when
speed is an issue, ALNs are not just slightly better, but thousands
of times better than anything else.

As for the quality of ALN classifiers, you'll just have to try them
out in your area of application. For beef grading, at last report,
ALNs were the best technique they had. A particle physicist, as far
as I know, is still trying to explain what information an ALN was
using when it managed to learn even after being deprived of the
momentum information that was thought to be essential. In
prosthetics, the ALN did not repeat a mistake that a physiotherapist
made, even though the mistake was present in the training set. A lot
of people are pretty happy with their results from ALNs.

Good luck with your results using them!

Bill Armstrong

The file for release 2.7 will be announced as soon as it is
available. Another posting will provide a list of references.